Behavioral constraint template-based sequence classification
In this paper we present the interesting Behavioral Constraint Miner (iBCM), a new approach to classifying sequences. The prevalence of sequential data, i.e., collections of ordered items such as text, website navigation patterns, traffic management records, and so on, has incited a surge of research interest in sequence classification. Existing approaches mainly focus on retrieving sequences of itemsets and checking their presence in labeled data streams to obtain a classifier. The proposed iBCM approach, rather than focusing on plain sequences, is template-based and draws its inspiration from behavioral patterns used for software verification. These patterns have a broad range of characteristics and go beyond the typical sequence mining representation, allowing sequential information in a database to be captured more precisely and concisely. Furthermore, it is also possible to mine for negative information, i.e., sequences that do not occur. The technique is benchmarked against other state-of-the-art approaches and exhibits strong potential for sequence classification. Code related to this chapter is available at: http://feb.kuleuven.be/public/u0092789/
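The core idea of template-based classification can be sketched as follows: each sequence is mapped to a binary feature vector indicating which behavioral constraint templates (here, Declare-style response and precedence, as examples) it satisfies, and absent constraints supply the negative information the abstract mentions. This is a minimal illustration of the general idea, not the iBCM algorithm itself; all function names are illustrative.

```python
# Minimal sketch of template-based feature extraction for sequence
# classification. Not iBCM itself; only two constraint templates shown.
from itertools import permutations

def response(seq, a, b):
    """True if every occurrence of a is eventually followed by b."""
    expecting = False
    for x in seq:
        if x == a:
            expecting = True
        elif x == b:
            expecting = False
    return not expecting

def precedence(seq, a, b):
    """True if every occurrence of b is preceded by an earlier a."""
    seen_a = False
    for x in seq:
        if x == a:
            seen_a = True
        elif x == b and not seen_a:
            return False
    return True

def constraint_features(seq, alphabet):
    """Binary feature vector, one entry per (template, a, b) pair.
    Zero-valued entries capture negative information (constraints
    that do NOT hold for this sequence)."""
    feats = {}
    for a, b in permutations(sorted(alphabet), 2):
        feats[f"response({a},{b})"] = int(response(seq, a, b))
        feats[f"precedence({a},{b})"] = int(precedence(seq, a, b))
    return feats

feats = constraint_features(list("abcab"), {"a", "b", "c"})
```

Feature vectors built this way can then be fed to any off-the-shelf classifier, which is what makes the representation both concise and classifier-agnostic.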
Can recurrent neural networks learn process model structure?
Various methods using machine and deep learning have been proposed to tackle different tasks in predictive process monitoring, i.e., forecasting for an ongoing case the most likely next event or suffix, its remaining time, or an outcome-related variable. Recurrent neural networks (RNNs), and more specifically long short-term memory nets (LSTMs), stand out in terms of popularity. In this work, we investigate the capability of such an LSTM to actually learn the underlying process model structure of an event log. We introduce an evaluation framework that combines variant-based resampling and custom metrics for fitness, precision, and generalization. We evaluate four hypotheses concerning the learning capabilities of LSTMs, the effect of overfitting countermeasures, the level of incompleteness in the training set, and the level of parallelism in the underlying process model. We confirm that LSTMs can struggle to learn process model structure, even with simplistic process data and in a very lenient setup. Taking the correct anti-overfitting measures can alleviate the problem; however, these measures did not prove optimal when hyperparameters were selected purely on prediction accuracy. We also found that decreasing the amount of information seen by the LSTM during training causes a sharp drop in generalization and precision scores. In our experiments, we could not identify a relationship between the extent of parallelism in the model and the generalization capability, but our results do indicate that the process' complexity might have an impact.
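The variant-based resampling mentioned above can be sketched as follows: traces are grouped by variant (their exact activity sequence), and whole variants are held out for testing, so a model is evaluated on behavior it has never seen verbatim. This is a hedged illustration of the general idea, assuming an event log represented as lists of activity labels; it is not the paper's exact framework.

```python
# Sketch of variant-based resampling: train and test sets share
# no trace variant, so the test set probes generalization to
# unseen behavior rather than memorization.
from collections import defaultdict
import random

def variant_split(log, test_fraction=0.3, seed=42):
    """Split an event log so train and test share no variant.
    A variant is the exact sequence of activities in a trace."""
    variants = defaultdict(list)
    for trace in log:
        variants[tuple(trace)].append(trace)
    keys = sorted(variants)  # deterministic order before shuffling
    random.Random(seed).shuffle(keys)
    n_test = max(1, int(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])
    train = [t for k in keys if k not in test_keys for t in variants[k]]
    test = [t for k in test_keys for t in variants[k]]
    return train, test

log = [["a", "b", "c"], ["a", "b", "c"], ["a", "c", "b"], ["a", "b", "b", "c"]]
train, test = variant_split(log)
```

Holding out variants rather than individual traces is what makes the generalization metric meaningful: a next-event predictor that merely memorizes training variants scores poorly on such a split.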
Mixed-Paradigm Process Modeling with Intertwined State Spaces
Business process modeling often deals with the trade-off between comprehensibility and flexibility, and many languages have been proposed to support different paradigms for tackling these characteristics. Well-known procedural, token-based languages such as Petri nets, BPMN, and EPCs have been used and extended to incorporate more flexible use cases; meanwhile, the declarative workflow paradigm, most notably represented by the Declare framework, is widely accepted for modeling flexible processes. A real trade-off exists between readable but rather inflexible procedural models and highly expressive but cognitively demanding declarative models containing a lot of implicit behavior. This paper investigates in detail the scenarios in which combining both approaches is useful, provides a scoring table for Declare constructs to capture their intricacies and similarities compared to procedural ones, and offers a step-wise approach to constructing mixed-paradigm models. Such models are especially useful in environments with different layers of flexibility and go beyond using atomic subprocesses modeled according to either paradigm. The paper combines Petri nets and Declare to express the findings.
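A mixed-paradigm check can be illustrated as follows: a trace must both replay on a procedural backbone (here reduced to a simple state machine standing in for a Petri net) and satisfy declarative constraints layered over it. This is a minimal sketch of the combination, assuming illustrative activity names; it does not reproduce the paper's intertwined-state-space semantics.

```python
# Sketch of a mixed-paradigm conformance check: a procedural
# backbone (state machine) combined with a Declare-style constraint.
# Activity and state names are illustrative.

def fits_backbone(trace, transitions, start="start", end="end"):
    """Replay the trace on a simple state machine (procedural part)."""
    state = start
    for act in trace:
        if (state, act) not in transitions:
            return False
        state = transitions[(state, act)]
    return state == end

def not_coexistence(trace, a, b):
    """Declare not co-existence: a and b never both occur."""
    return not ({a, b} <= set(trace))

transitions = {
    ("start", "register"): "s1",
    ("s1", "approve"): "end",
    ("s1", "reject"): "end",
}
trace = ["register", "approve"]
ok = fits_backbone(trace, transitions) and not_coexistence(trace, "approve", "reject")
```

The point of the combination is that each paradigm covers what the other expresses awkwardly: the backbone gives a readable flow, while the declarative layer adds global rules (here, that approval and rejection exclude each other) without cluttering the flow.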
Predictive Process Model Monitoring using Recurrent Neural Networks
The field of predictive process monitoring focuses on modelling future characteristics of running business process instances, typically by either predicting the outcome of particular objectives (e.g. completion time or cost) or next-in-sequence prediction (e.g. what is the next activity to execute). This paper introduces Processes-As-Movies (PAM), a technique that provides a middle ground between these two types of predictive monitoring. It does so by capturing declarative process constraints between activities in various windows of a process execution trace, which represent a declarative process model at subsequent stages of execution. This high-dimensional representation of a process model allows predictive modelling of how such constraints appear and vanish throughout a process' execution. Various recurrent neural network topologies tailored to high-dimensional input are used to model the process model evolution with windows as time steps, including encoder-decoder long short-term memory networks and convolutional long short-term memory networks. Results show that these topologies are very effective in terms of accuracy and precision for predicting a process model's future state, which allows process owners to simultaneously verify which linear temporal logic rules hold in a predicted process window (objective-based) and which future execution traces are allowed by all the constraints together (trace-based).
CORE: A Few-Shot Company Relation Classification Dataset for Robust Domain Adaptation
We introduce CORE, a dataset for few-shot relation classification (RC)
focused on company relations and business entities. CORE includes 4,708
instances of 12 relation types with corresponding textual evidence extracted
from company Wikipedia pages. Company names and business entities pose a
challenge for few-shot RC models due to the rich and diverse information
associated with them. For example, a company name may represent the legal
entity, products, people, or business divisions depending on the context.
Therefore, deriving the relation type between entities is highly dependent on
textual context. To evaluate the performance of state-of-the-art RC models on
the CORE dataset, we conduct experiments in the few-shot domain adaptation
setting. Our results reveal substantial performance gaps, confirming that
models trained on different domains struggle to adapt to CORE. Interestingly,
we find that models trained on CORE showcase improved out-of-domain
performance, which highlights the importance of high-quality data for robust
domain adaptation. Specifically, the information richness embedded in business
entities allows models to focus on contextual nuances, reducing their reliance
on superficial clues such as relation-specific verbs. In addition to the
dataset, we provide relevant code snippets to facilitate reproducibility and
encourage further research in the field.
Comment: Accepted to the EMNLP 2023 main conference.
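The few-shot evaluation setting described above can be sketched as N-way K-shot episode sampling: each episode draws N relation types, with K labeled support instances and a set of query instances per type. The snippet below is a generic illustration of that protocol, assuming a simple (text, label) dataset; it does not use CORE's actual schema or fields.

```python
# Sketch of N-way K-shot episode construction for few-shot relation
# classification. Dataset fields are illustrative.
import random
from collections import defaultdict

def sample_episode(dataset, n_way, k_shot, n_query, seed=0):
    """Build one N-way K-shot episode: support and query sets."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in dataset:
        by_label[label].append(text)
    labels = rng.sample(sorted(by_label), n_way)
    support, query = [], []
    for label in labels:
        examples = rng.sample(by_label[label], k_shot + n_query)
        support += [(x, label) for x in examples[:k_shot]]
        query += [(x, label) for x in examples[k_shot:]]
    return support, query

data = [(f"sent{i}", f"rel{i % 3}") for i in range(30)]
support, query = sample_episode(data, n_way=2, k_shot=3, n_query=2)
```

In the domain-adaptation variant the episodes are drawn from a target domain (such as CORE) while the model was trained on episodes from a different source domain, which is what exposes the performance gaps the abstract reports.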